A survey of variable selection methods and multiclass learning in bio informatics
Abstract
Data mining based on feature selection has been one of the most important research directions in machine learning in recent years. This paper reviews the three main families of feature selection methods (filter, wrapper, and embedded) and multiclass classifiers such as support vector machines (SVM), decision trees, averaged perceptrons, and neural networks. It also presents an assessment of these classifiers on a breast cancer dataset.
Index Terms: Feature selection, Bio informatics, Machine learning, Multiclass classification.
Similar resources
An Evaluation of Feature Selection Methods for Multiclass Learning in Bio Informatics
Traditional data mining techniques such as classification or clustering have proved successful on datasets with multiple instances in a single relation, but when extreme dimensionality or complex dependencies are present in the data, they fail to offer accuracy and correctness. As a solution to this, feature (attribute/variable) selection techniques have over the last two decades verif...
Sparsity Regularization for classification of large dimensional data
Feature selection has evolved to be a very important step in several machine learning paradigms. Especially in the domains of bio-informatics and text classification which involve data of high dimensions, feature selection can help in drastically reducing the feature space. In cases where it is difficult or infeasible to obtain sufficient training examples, feature selection helps overcome the ...
Efficient Feature Selection and Multiclass Classification with Integrated Instance and Model Based Learning
Multiclass classification and feature (variable) selections are commonly encountered in many biological and medical applications. However, extending binary classification approaches to multiclass problems is not trivial. Instance-based methods such as the K nearest neighbor (KNN) can naturally extend to multiclass problems and usually perform well with unbalanced data, but suffer from the curse...
An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
Correction: Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data
The amount of metagenomic data is growing rapidly while the computational methods for metagenome analysis are still in their infancy. It is important to develop novel statistical learning tools for the prediction of associations between bacterial communities and disease phenotypes and for the detection of differentially abundant features. In this study, we presented a novel statistical learning...